Abstract

We consider minimization problems of the form $P(\varphi, g, h)$: $\min\{f(x) = \varphi(x) + g(x) - h(x) : x \in \mathbb{R}^n\}$,
where $\varphi$ is a differentiable function and $g$, $h$ are convex functions, and introduce
iterative methods for finding a critical point of $f$ when $f$ is differentiable. We show that
the point computed by the proximal point algorithm at each iteration can be used to
determine a descent direction for the objective function at this point. The resulting
algorithm can be viewed as a combination of the proximal point algorithm with a
linesearch step that uses this descent direction. We also study the convergence of these
algorithms and of the inertial proximal methods proposed by P.E. Maingé et al. under the
main assumption that the objective function satisfies the Kurdyka-Lojasiewicz property.

Keywords: DC programming, Kurdyka-Lojasiewicz inequality, proximal map¬ 
1 Introduction

Let $\mathbb{R}^n$ be an $n$-dimensional Euclidean space with the inner product $\langle \cdot, \cdot \rangle$ and the
associated norm $\|\cdot\|$. Let $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ be a nonconvex function that can be
decomposed in the form

\[ f(x) = \varphi(x) + g(x) - h(x), \]

where $\varphi : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function (not necessarily convex) and
$g, h : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ are proper, lower semicontinuous convex functions. We consider the
following optimization problem:

\[ P(\varphi, g, h): \quad \min\{ f(x) = \varphi(x) + g(x) - h(x) : x \in \mathbb{R}^n \}. \]


Problems of this form have been investigated in some recent papers, such as those of
P.E. Maingé and A. Moudafi and of N.T. An and N.M. Nam. The special structure of this
problem allows us to use the powerful tools of convex analysis and convex optimization.
The differentiability of $\varphi$ and the convexity of $g$ and $h$ are employed to develop
appropriate tools from both the theoretical and the algorithmic points of view.


When $\varphi$ is convex (in particular, when $\varphi = 0$), the function $f$ is called a DC function
(Difference of two Convex functions). It is worth mentioning that the class of DC functions
contains all lower-$\mathcal{C}^2$ functions and is closed under all operations usually considered in
optimization. Some interesting optimality conditions and duality theorems related to DC
programs are given in, for example, [14].


For solving DC programs by means of convex analysis, DC algorithms (DCA), based on
local optimality conditions and duality in DC programming, were introduced by T. Pham
Dinh in 1986 as an extension of subgradient algorithms to DC programming and have been
extensively developed by H.A. Le Thi and T. Pham Dinh since 1994. Since then, many
authors have contributed to providing a mathematical foundation for the algorithm and
making it accessible for applications; see, e.g., [17], [33] and the references quoted therein.


One of the most important methods for handling ill-posed problems is the proximal point
method. This method was first introduced by Martinet [23] in 1970 for solving convex
minimization problems and then extensively developed by R.T. Rockafellar [37] for finding
a zero of a maximal monotone operator. Since then, many researchers have succeeded in
applying this method to many other problems, such as variational inequality problems [13]
and equilibrium problems [25]. Sun et al. (see also [26] for more proximal point algorithms)
and N.T. An et al. applied the proximal point method to DC optimization and to problem
$P(\varphi, g, h)$, respectively.


With a suitable DC decomposition, DCA becomes a proximal point algorithm. Very
recently, Artacho et al. introduced the so-called boosted DC algorithm with backtracking
for solving differentiable DC programs. This method can be considered as a combination
of DCA and the descent algorithm proposed by Fukushima and Mine [15], which forces the
value of the objective function at each iteration to decrease more than it does under DCA.


Along with the proximal point algorithm, the inertial proximal method was proposed for
solving $P(\varphi, g, h)$ by P.E. Maingé and A. Moudafi. Although this algorithm has been known
since 2008, its convergence analysis for general classes of difference functions is, to the
best of our knowledge, still an open research question.


In this paper, we first introduce an algorithm, called the boosted proximal point algorithm,
for finding a stationary point of $P(\varphi, g, h)$ when $f$ is differentiable. This algorithm can be
seen as a combination of the proximal point method and the descent algorithm proposed by
Fukushima and Mine [15], which makes the value of the objective function $f$ at each
iteration decrease more than it does in the proximal point algorithm. The global
convergence of the proposed algorithm and its convergence rate are obtained under the
main assumption that the objective function satisfies the Lojasiewicz inequality [21].
Based on a recently developed method, we then prove the global convergence of the
inertial proximal algorithm for $P(\varphi, g, h)$ provided that $f$ possesses the
Kurdyka-Lojasiewicz property.


The rest of the paper is organized as follows. In Section 2, some tools of variational
analysis are recalled. The boosted proximal point algorithm for $P(\varphi, g, h)$ and its
convergence analysis are presented in Section 3. Section 4 is devoted to the convergence
of the inertial proximal algorithm for $P(\varphi, g, h)$.


2 Preliminaries 


This section contains the preliminaries needed throughout the paper. We start with
generalized differentiation for nonsmooth functions, referring the reader to standard
monographs such as [38] for more details and commentaries.

Let us denote the nonnegative orthant in $\mathbb{R}^n$ by $\mathbb{R}^n_+ = [0, \infty)^n$ and by $\mathbb{B}(x, r)$ the closed
ball of center $x$ and radius $r > 0$. The gradient of a differentiable function $f : \mathbb{R}^n \to \mathbb{R}^m$ at
some point $x \in \mathbb{R}^n$ is denoted by $\nabla f(x) \in \mathbb{R}^{n \times m}$.

For an extended-real-valued function $f : \mathbb{R}^n \to \overline{\mathbb{R}} := \mathbb{R} \cup \{+\infty\}$, the domain of $f$ is
the set $\operatorname{dom} f = \{x \in \mathbb{R}^n \mid f(x) < +\infty\}$. Moreover, $f$ is said to be proper if its domain is
nonempty, and coercive if $f(x) \to +\infty$ whenever $\|x\| \to +\infty$.


Let $\Omega \subset \mathbb{R}^n$ with $\Omega \neq \emptyset$. We use the notation $d(x; \Omega)$ to denote the distance from $x$ to
$\Omega$, i.e.,

\[ d(x; \Omega) = \inf_{y \in \Omega} \|x - y\|. \]


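Recall that a function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is said to be strongly convex with modulus $r > 0$ if

\[ f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) - \frac{r}{2} \lambda (1 - \lambda) \|x - y\|^2 \]

for all $x, y \in \mathbb{R}^n$ and $\lambda \in (0, 1)$. Moreover, if $r = 0$, $f$ is said to be convex. Clearly, $f$ is
strongly convex with modulus $r$ if and only if $f - \frac{r}{2}\|\cdot\|^2$ is convex.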

The function $f$ is called Lipschitz continuous if there exists a positive constant $L$ such
that

\[ \|f(x) - f(y)\| \le L \|x - y\| \quad \text{for all } x, y \in \mathbb{R}^n. \]

Further, $f$ is called locally Lipschitz continuous if for every $x \in \mathbb{R}^n$ there exists a
neighborhood $V$ of $x$ such that $f$ restricted to $V$ is Lipschitz continuous.

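Given a lower semicontinuous function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, we use the symbol $z \xrightarrow{f} x$ to indicate
that $z \to x$ and $f(z) \to f(x)$. The Fréchet subdifferential of $f$ at $x \in \operatorname{dom} f$ is defined by

\[ \partial^F f(x) = \Big\{ v \in \mathbb{R}^n \;\Big|\; \liminf_{z \to x,\, z \neq x} \frac{f(z) - f(x) - \langle v, z - x \rangle}{\|z - x\|} \ge 0 \Big\}. \]

Set $\partial^F f(x) = \emptyset$ if $x \notin \operatorname{dom} f$. It is worth noting that the Fréchet subdifferential mapping
does not have a closed graph, and it is computationally unstable. Based on the Fréchet
subdifferential, the limiting subdifferential of $f$ at $x \in \operatorname{dom} f$ (also known as the general,
basic, or Mordukhovich subdifferential) is defined by

\[ \partial^L f(x) = \limsup_{z \xrightarrow{f} x} \partial^F f(z) = \{ v \in \mathbb{R}^n \mid \exists\, x^k \xrightarrow{f} x,\ v^k \in \partial^F f(x^k),\ v^k \to v \}. \]

Set also $\partial^L f(x) = \emptyset$ if $x \notin \operatorname{dom} f$. The definition yields the following
robustness/closedness property of $\partial^L f$:

\[ \{ v \in \mathbb{R}^n \mid \exists\, x^k \xrightarrow{f} x,\ v^k \to v,\ v^k \in \partial^L f(x^k) \} = \partial^L f(x). \]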

Observe that, by Theorem 8.6 in [38], one has $\partial^F f(x) \subset \partial^L f(x)$ for every $x \in \mathbb{R}^n$, where
the first set is closed and convex while the second one is closed. If $f$ is differentiable at
$x$, then $\partial^F f(x) = \{\nabla f(x)\}$, and if $f$ is continuously differentiable on a neighborhood of $x$,
then $\partial^L f(x) = \{\nabla f(x)\}$. For a convex function $f$, the Fréchet and limiting subdifferentials
reduce to the classical subdifferential in the sense of convex analysis:

\[ \partial f(x) = \{ v \in \mathbb{R}^n \mid \langle v, z - x \rangle \le f(z) - f(x),\ \forall z \in \mathbb{R}^n \}. \]

We also need another subdifferential, the Clarke subdifferential, which is based on
generalized directional derivatives. The Clarke subdifferential of a function $f$ that is
locally Lipschitz continuous around $x$ can be represented through the limiting
subdifferential: $\partial^C f(x) = \operatorname{co} \partial^L f(x)$, where $\operatorname{co} \Omega$ denotes the convex hull of an
arbitrary set $\Omega$.
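To make the three notions concrete, here is a short worked example on a function chosen purely for illustration (it does not appear in the paper): $f(x) = -|x|$ on $\mathbb{R}$, evaluated at $x = 0$. We write $\partial^F$, $\partial^L$ and $\partial^C$ for the Fréchet, limiting and Clarke subdifferentials, respectively.

```latex
% Illustrative example (not from the source): f(x) = -|x| at x = 0.
\begin{align*}
v \in \partial^F f(0)
  &\iff \liminf_{z \to 0,\, z \neq 0} \frac{-|z| - v z}{|z|} \ge 0
   \iff \min\{-1 - v,\; -1 + v\} \ge 0, \\
  &\text{which is impossible, so } \partial^F f(0) = \emptyset. \\[4pt]
\partial^F f(z) &= \{-1\} \ (z > 0), \qquad \partial^F f(z) = \{+1\} \ (z < 0), \\
\partial^L f(0) &= \{-1,\; +1\}, \qquad
\partial^C f(0) = \operatorname{co}\{-1, +1\} = [-1, 1].
\end{align*}
```

The example shows that the inclusion $\partial^F f(x) \subset \partial^L f(x)$ can be strict, that the limiting subdifferential need not be convex, and that $0$ can belong to the Clarke subdifferential without belonging to the limiting one.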


Proposition 2.1 ([38], pp. 304) Let $f = g + h$, where $g$ is lower semicontinuous and $h$ is
continuously differentiable on a neighborhood of $x$. Then

\[ \partial^F f(x) = \partial^F g(x) + \nabla h(x) \quad \text{and} \quad \partial^L f(x) = \partial^L g(x) + \nabla h(x). \]

Proposition 2.2 ([38], pp. 422) If a lower semicontinuous function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ has a local
minimum at $x \in \operatorname{dom} f$, then $0 \in \partial^F f(x) \subset \partial^L f(x)$. In the convex case, this condition is
not only necessary for a local minimum but also sufficient for a global minimum.


Note that for a finite convex function $h$ on $\mathbb{R}^n$, if $y^k \in \partial h(x^k)$ for all $k$ and $\{x^k\}$ is
bounded, then the sequence $\{y^k\}$ is also bounded; see Definition 5.14, Proposition
5.15 and Theorem 9.13 in [38].


To establish our convergence results, we need the Kurdyka-Lojasiewicz property (K-L
property for short), defined below. Before that, let us recall the definitions of a
semi-algebraic set and function. A subset $\Omega$ of $\mathbb{R}^n$ is called semi-algebraic if
it can be represented as a finite union of sets of the form

\[ \{ x \in \mathbb{R}^n \mid p_i(x) = 0,\ q_i(x) < 0 \ \text{for all } i = 1, \dots, m \}, \]

where $p_i$ and $q_i$, $i = 1, \dots, m$, are polynomial functions. A function $f$ is said to be
semi-algebraic if its graph is a semi-algebraic subset of $\mathbb{R}^{n+1}$.


Definition 2.1 A lower semicontinuous function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ satisfies the K-L property
at $x^* \in \operatorname{dom} \partial^L f$ if there exist $\varepsilon > 0$, a neighborhood $U$ of $x^*$ and a continuous concave
function $\theta : [0, \varepsilon[ \to [0, +\infty[$ with

(a) $\theta(0) = 0$,

(b) $\theta > 0$ on $]0, \varepsilon[$,

(c) $\theta$ is of class $\mathcal{C}^1$ on $]0, \varepsilon[$,

(d) for every $x \in U$ with $f(x^*) < f(x) < f(x^*) + \varepsilon$, one has

\[ \theta'(f(x) - f(x^*)) \, d(0; \partial^L f(x)) \ge 1. \]

It is known that a proper lower semicontinuous semi-algebraic function always satisfies the
K-L property. Moreover, $f$ is said to satisfy the strong Kurdyka-Lojasiewicz property at
$x^*$ if (a)-(d) hold for the Clarke subdifferential $\partial^C f(x)$. In fact, it was shown very
recently that the class of definable functions, which contains the class of semi-algebraic
functions, satisfies the strong Kurdyka-Lojasiewicz property at each point of
$\operatorname{dom} \partial^C f$.

It is also known that a proper lower semicontinuous function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ has the K-L
property at any point $x \in \mathbb{R}^n$ such that $0 \notin \partial^L f(x)$. If the function $f : \mathbb{R}^n \to \mathbb{R}$ is
differentiable and $\theta(t) = M t^{1 - \kappa}$, where $M > 0$ and $\kappa \in [0, 1)$, then we have the following
definition, which is a special case of Definition 2.1.

Definition 2.2 If $f : \mathbb{R}^n \to \mathbb{R}$ is a differentiable function, then $f$ is said to have the
Lojasiewicz property if for any critical point $\bar{x}$ there exist constants $M > 0$, $\varepsilon > 0$ and
$\kappa \in [0, 1)$ such that

\[ |f(x) - f(\bar{x})|^{\kappa} \le M \|\nabla f(x)\| \quad \text{for all } x \in \mathbb{B}(\bar{x}, \varepsilon), \tag{2.1} \]

where we adopt the convention $0^0 = 1$; the constant $\kappa$ is called the Lojasiewicz exponent of
$f$ at $\bar{x}$.


A differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be real analytic if for every $x \in \mathbb{R}^n$, $f$ can
be represented by a convergent power series in some neighbourhood of $x$. It was shown in [21]
that every real analytic function $f : \mathbb{R}^n \to \mathbb{R}$ satisfies the Lojasiewicz property with
exponent $\kappa \in [0, 1)$.
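As a numerical illustration of the Lojasiewicz inequality, the sketch below checks $|f(x) - f(\bar{x})|^{\kappa} \le M \|\nabla f(x)\|$ on a grid for the analytic function $f(x) = x^4$ at its critical point $\bar{x} = 0$. The test function, the grid, the constants `M` and `radius`, and the helper name `lojasiewicz_holds` are all assumptions made for this example and do not come from the paper.

```python
def f(x: float) -> float:
    # Illustrative real analytic function with a single critical point at 0.
    return x ** 4


def grad_f(x: float) -> float:
    return 4.0 * x ** 3


def lojasiewicz_holds(kappa: float, M: float, radius: float, samples: int = 2001) -> bool:
    """Check |f(x) - f(0)|**kappa <= M * |f'(x)| on a uniform grid of [-radius, radius]."""
    for i in range(samples):
        x = -radius + 2.0 * radius * i / (samples - 1)
        if x == 0.0:
            continue  # both sides vanish at the critical point when kappa > 0
        lhs = abs(f(x) - f(0.0)) ** kappa
        rhs = M * abs(grad_f(x))
        if lhs > rhs * (1.0 + 1e-12):  # small slack for floating-point rounding
            return False
    return True


# With kappa = 3/4 both sides equal |x|**3 when M = 1/4, so the inequality holds.
ok_three_quarters = lojasiewicz_holds(kappa=0.75, M=0.25, radius=1.0)
# With kappa = 1/2 the left side is x**2, which dominates M * 4|x|**3 near 0.
ok_one_half = lojasiewicz_holds(kappa=0.5, M=10.0, radius=1.0)
```

For this particular function the exponent $\kappa = 3/4$ works while $\kappa = 1/2$ fails near the origin, which is the situation covered by the sublinear case of the rate results later in the paper.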


We will also employ the following useful lemma to obtain bounds on the rate of
convergence of the sequence generated by Algorithm 1 (see the next section). This lemma
appears within the proof of Theorem 2 of [6] for specific values of $\mu$ and $\nu$. For
convenience, we give a brief proof.


Lemma 2.1 Let $\{t_k\}$ be a sequence in $\mathbb{R}_+$ and let $\mu \ge 0$ and $\nu > 0$ be some constants.
Suppose that $t_k \to 0$ and the sequence satisfies

\[ t_k^{\mu} \le \nu (t_k - t_{k+1}) \quad \text{for all } k \text{ sufficiently large}. \tag{2.2} \]

Then

(a) If $\mu = 0$, the sequence $\{t_k\}$ converges to 0 in a finite number of steps.

(b) If $\mu \in (0, 1]$, the sequence $\{t_k\}$ converges linearly to 0 with rate $1 - \frac{1}{\nu}$.

(c) If $\mu > 1$, there exists $\gamma > 0$ such that for all $k$ sufficiently large

\[ t_k \le \gamma k^{-\frac{1}{\mu - 1}}. \]

Proof. (a) If $\mu = 0$, then from (2.2) one has $0 \le t_{k+1} \le t_k - \frac{1}{\nu}$, which implies (a).

(b) Now suppose that $\mu \in (0, 1]$. As $t_k \to 0$, we obtain that $t_k < 1$ for all $k$ large
enough; then from (2.2),

\[ t_k \le t_k^{\mu} \le \nu (t_k - t_{k+1}), \]

thus $t_{k+1} \le (1 - \frac{1}{\nu}) t_k$, which means that $\{t_k\}$ converges to 0 linearly with rate $1 - \frac{1}{\nu}$.

(c) Assume that $\mu > 1$. If $t_k = 0$ for some $k$, then from (2.2) we have $t_{k+1} = 0$, which
shows that the sequence $\{t_k\}$ converges to zero in a finite number of steps, and (c)
trivially holds. Therefore we can assume that $t_k > 0$ for all $k$.

Consider the decreasing function $u : (0, +\infty) \to \mathbb{R}$ defined by $u(t) = t^{-\mu}$. Assume that
(2.2) holds for all $k \ge N$ for some positive integer $N$; then for $k \ge N$ we get

\[ \frac{1}{\nu} \le (t_k - t_{k+1}) u(t_k) \le \int_{t_{k+1}}^{t_k} u(t)\, dt = \frac{t_{k+1}^{1 - \mu} - t_k^{1 - \mu}}{\mu - 1}. \]

Since $\mu > 1$,

\[ t_{k+1}^{1 - \mu} - t_k^{1 - \mu} \ge \frac{\mu - 1}{\nu} \quad \text{for all } k \ge N. \]

Summing over $k$ from $N$ to $j - 1 \ge N$, one has

\[ t_j^{1 - \mu} - t_N^{1 - \mu} \ge \frac{\mu - 1}{\nu} (j - N), \]

which gives

\[ t_j \le \Big( t_N^{1 - \mu} + \frac{\mu - 1}{\nu} (j - N) \Big)^{-\frac{1}{\mu - 1}} \quad \text{for all } j \ge N + 1. \]

In conclusion, there exists $\gamma > 0$ such that $t_k \le \gamma k^{-\frac{1}{\mu - 1}}$ for all $k$ sufficiently large. □
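The three regimes of Lemma 2.1 can be observed numerically. The sketch below iterates the recursion with equality, $t_{k+1} = t_k - t_k^{\mu}/\nu$, which is the slowest decay allowed by the hypothesis of the lemma; the parameter values, starting points and the helper name `worst_case_sequence` are illustrative assumptions, not taken from the paper.

```python
def worst_case_sequence(mu: float, nu: float, t0: float, steps: int) -> list:
    """Iterate t_{k+1} = t_k - t_k**mu / nu, i.e. the lemma's inequality with equality."""
    ts = [t0]
    for _ in range(steps):
        t = ts[-1]
        ts.append(max(t - t ** mu / nu, 0.0))  # clamp so the sequence stays in R_+
    return ts


# Case (b): mu = 1, nu = 4 gives t_{k+1} = (1 - 1/nu) * t_k, a linear rate of 3/4.
linear = worst_case_sequence(mu=1.0, nu=4.0, t0=1.0, steps=30)
ratios = [linear[k + 1] / linear[k] for k in range(30)]

# Case (c): mu = 2, nu = 1 gives t_{k+1} = t_k - t_k**2, which decays like 1/k,
# matching the bound gamma * k**(-1/(mu - 1)) with exponent -1.
sublinear = worst_case_sequence(mu=2.0, nu=1.0, t0=0.5, steps=1000)
scaled = [k * sublinear[k] for k in range(1, 1001)]
```

In the second experiment the products $k\, t_k$ stay bounded, so $\gamma = 1$ is an admissible constant for this particular sequence.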


Before moving to the next section, let us recall the following lemma regarding an upper
bound for a smooth function with Lipschitz continuous gradient; see [28].


Lemma 2.2 If $g : \mathbb{R}^n \to \mathbb{R}$ is a differentiable function with $L$-Lipschitz gradient, then for
all $x, y \in \mathbb{R}^n$ one has

\[ g(y) \le g(x) + \langle \nabla g(x), y - x \rangle + \frac{L}{2} \|y - x\|^2. \tag{2.3} \]
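The quadratic upper bound of Lemma 2.2 is easy to test numerically. The sketch below uses the softplus sum $g(x) = \sum_i \log(1 + e^{x_i})$, whose gradient is the componentwise logistic function and is Lipschitz with constant $L = 1/4$; this test function, the sampling box and the helper name `descent_gap` are assumptions chosen for the example, not part of the paper.

```python
import math
import random


def g(x):
    # Softplus sum: smooth and convex, with second derivatives bounded by 1/4.
    return sum(math.log1p(math.exp(xi)) for xi in x)


def grad_g(x):
    return [1.0 / (1.0 + math.exp(-xi)) for xi in x]


L = 0.25  # Lipschitz constant of grad_g


def descent_gap(x, y):
    """Right-hand side of the bound minus g(y); the lemma says this is nonnegative."""
    diff = [yi - xi for xi, yi in zip(x, y)]
    inner = sum(gi * di for gi, di in zip(grad_g(x), diff))
    quad = 0.5 * L * sum(di * di for di in diff)
    return g(x) + inner + quad - g(y)


rng = random.Random(0)  # fixed seed so the experiment is reproducible
gaps = []
for _ in range(1000):
    x = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    y = [rng.uniform(-5.0, 5.0) for _ in range(3)]
    gaps.append(descent_gap(x, y))
```

Every sampled gap should be nonnegative up to rounding, which is the inequality used later to bound $\varphi(y^k)$ in terms of $\varphi(x^k)$.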

3 Boosted Proximal Point Algorithm

Let us introduce our first algorithm to solve $P(\varphi, g, h)$.

Algorithm 1

Initialization. Pick $x^0 \in \mathbb{R}^n$, choose parameters $\eta \in (0, 1)$, $\alpha > 0$, $\{\lambda_k\} \subset (0, +\infty)$.
Iteration $k$ ($k = 0, 1, 2, \dots$). Having $x^k$, do the following steps:

Step 1. Solve the following strongly convex program

\[ \min_{x \in \mathbb{R}^n} \Big\{ g(x) - \langle \nabla h(x^k) - \nabla \varphi(x^k), x - x^k \rangle + \frac{\lambda_k}{2} \|x - x^k\|^2 \Big\} \]

to get the unique solution $y^k$.

Set $d^k = y^k - x^k$. If $d^k = 0$, then stop. Otherwise, go to Step 2.

Step 2. (Armijo linesearch rule)

Find $m_k$ as the smallest positive integer $m$ such that

\[ f(y^k + \eta^m d^k) \le f(y^k) - \alpha \eta^m \|d^k\|^2. \]

Set $\eta_k = \eta^{m_k}$ and $x^{k+1} = y^k + \eta_k d^k$, and go to Iteration $k$ with $k$ replaced by $k + 1$.
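The two steps of Algorithm 1 can be sketched on a one-dimensional toy problem. Everything specific below is an assumption made for illustration and not taken from the paper: the test functions $\varphi(x) = \tfrac12 \cos x$ (so $L_1 = 1/2$), $g(x) = x^2$ and $h(x) = 2\sqrt{1 + x^2}$, the constant choice $\lambda_k = 2$, the values of $\eta$ and $\alpha$, the stopping tolerance, and the cap on backtracking steps. For this $g$, the strongly convex subproblem of Step 1 has a closed-form solution.

```python
import math


def phi(x):  return 0.5 * math.cos(x)          # smooth part, gradient is 1/2-Lipschitz
def dphi(x): return -0.5 * math.sin(x)
def g(x):    return x * x                      # convex, differentiable
def h(x):    return 2.0 * math.sqrt(1.0 + x * x)  # convex, differentiable
def dh(x):   return 2.0 * x / math.sqrt(1.0 + x * x)
def f(x):    return phi(x) + g(x) - h(x)
def df(x):   return dphi(x) + 2.0 * x - dh(x)


def boosted_ppa(x0, lam=2.0, eta=0.5, alpha=0.1, tol=1e-10,
                max_iter=500, max_backtracks=60):
    """Sketch of Algorithm 1 for the toy problem above (scalar case)."""
    x = x0
    values = [f(x)]
    for _ in range(max_iter):
        # Step 1: minimize x**2 - (dh(x_k) - dphi(x_k)) * (x - x_k) + lam/2 * (x - x_k)**2.
        # Setting the derivative to zero gives the closed form below.
        y = (lam * x + dh(x) - dphi(x)) / (2.0 + lam)
        d = y - x
        if abs(d) <= tol:  # numerical version of the test d^k = 0
            break
        # Step 2: Armijo rule, smallest positive integer m that gives enough decrease.
        step = 0.0  # safeguard (assumption): fall back to x^{k+1} = y^k if the cap is hit
        for m in range(1, max_backtracks + 1):
            trial = eta ** m
            if f(y + trial * d) <= f(y) - alpha * trial * d * d:
                step = trial
                break
        x = y + step * d
        values.append(f(x))
    return x, values


x_star, values = boosted_ppa(2.0)
```

On this example the iterates approach a stationary point of $f$ located between 0.5 and 1, and the recorded objective values are nonincreasing.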

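Theorem 3.1 Suppose that $\nabla \varphi$ is Lipschitz continuous with constant $L_1$ and that
$L_1 + 2\lambda \le \lambda_k \le \bar{\lambda}$ for all $k$, where $\lambda > 0$. Then

(a) $f(y^k) \le f(x^k) - \frac{\lambda_k - L_1}{2} \|y^k - x^k\|^2$.

(b) The linesearch is well defined.

(c) $f(x^{k+1}) \le f(x^k) - \big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \big) \|d^k\|^2$, and the sequence $\{f(x^k)\}$ is strictly
decreasing and convergent.

(d) Any accumulation point of $\{x^k\}$ is a stationary point of $f$.

(e) $\sum_{k=0}^{\infty} \|d^k\|^2 < \infty$ and $\sum_{k=0}^{\infty} \|x^{k+1} - x^k\|^2 < \infty$.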

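Proof. (a) From Algorithm 1, comparing the objective of Step 1 at $y^k$ and at $x^k$, we have

\[ g(x^k) \ge g(y^k) - \langle \nabla h(x^k), y^k - x^k \rangle + \langle \nabla \varphi(x^k), y^k - x^k \rangle + \frac{\lambda_k}{2} \|y^k - x^k\|^2. \]

In addition, by the convexity of $h$,

\[ h(y^k) \ge h(x^k) + \langle \nabla h(x^k), y^k - x^k \rangle, \]

and, since $\nabla \varphi$ is Lipschitz with constant $L_1$, Lemma 2.2 gives

\[ \varphi(y^k) \le \varphi(x^k) + \langle \nabla \varphi(x^k), y^k - x^k \rangle + \frac{L_1}{2} \|y^k - x^k\|^2. \]

Combining the above inequalities, one has

\[ f(y^k) \le f(x^k) - \frac{\lambda_k - L_1}{2} \|y^k - x^k\|^2. \]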

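(b) We argue by contradiction. Suppose that, for every $m \ge 1$,

\[ f(y^k + \eta^m d^k) > f(y^k). \]

Then, for all $m$,

\[ f(y^k + \eta^m d^k) - f(y^k) > 0 \;\Longrightarrow\; \langle \nabla f(y^k), \eta^m d^k \rangle + o(\eta^m) > 0 \;\Longleftrightarrow\; \langle \nabla f(y^k), d^k \rangle + \frac{o(\eta^m)}{\eta^m} > 0, \]

and letting $m \to \infty$ yields

\[ \langle \nabla f(y^k), d^k \rangle \ge 0. \tag{3.4} \]

Besides that, from Step 1 in Algorithm 1, we have

\[ \nabla g(y^k) - \nabla h(x^k) + \nabla \varphi(x^k) + \lambda_k (y^k - x^k) = 0, \]

and

\[ \nabla f(y^k) = \nabla \varphi(y^k) + \nabla g(y^k) - \nabla h(y^k), \]

so that

\[ \nabla f(y^k) = -\nabla h(y^k) + \nabla h(x^k) + \nabla \varphi(y^k) - \nabla \varphi(x^k) - \lambda_k (y^k - x^k). \]

Hence, using the monotonicity of $\nabla h$ and the Lipschitz continuity of $\nabla \varphi$,

\[ \langle \nabla f(y^k), d^k \rangle = -\langle \nabla h(y^k) - \nabla h(x^k), d^k \rangle + \langle \nabla \varphi(y^k) - \nabla \varphi(x^k), d^k \rangle - \lambda_k \|y^k - x^k\|^2 \le -(\lambda_k - L_1) \|y^k - x^k\|^2 < 0. \]

This contradicts (3.4). So the linesearch is well defined.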

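(c) From Step 2 in Algorithm 1 and part (a), one has

\[ f(x^{k+1}) \le f(y^k) - \alpha \eta_k \|d^k\|^2 \le f(x^k) - \frac{\lambda_k - L_1}{2} \|d^k\|^2 - \alpha \eta_k \|d^k\|^2 = f(x^k) - \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2. \tag{3.5} \]

Hence $\{f(x^k)\}$ is a strictly decreasing sequence and, combining this with
$\inf_{x \in \mathbb{R}^n} f(x) > -\infty$, we get that $\lim_{k \to \infty} f(x^k)$ exists.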

(d) Since $\{f(x^k)\}$ is convergent,

\[ f(x^{k+1}) - f(x^k) \to 0, \]

and from (3.5) one has

\[ \|d^k\|^2 = \|y^k - x^k\|^2 \to 0. \]

Let $x^*$ be any accumulation point of $\{x^k\}$ and let $\{x^{k_i}\}$ be a subsequence of $\{x^k\}$ converging
to $x^*$. Since $\|y^{k_i} - x^{k_i}\| \to 0$, one has

\[ \lim_{i \to \infty} y^{k_i} = x^*. \]

Step 1 in Algorithm 1 yields

\[ \nabla g(y^{k_i}) - \big( \nabla h(x^{k_i}) - \nabla \varphi(x^{k_i}) \big) + \lambda_{k_i} (y^{k_i} - x^{k_i}) = 0. \]

Letting $i \to \infty$, we get from the above equality that

\[ \nabla g(x^*) - \nabla h(x^*) + \nabla \varphi(x^*) = 0, \]

which means that $x^*$ is a stationary point of $f$.

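(e) Observe that, from (3.5), one has

\[ \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2 \le f(x^k) - f(x^{k+1}). \tag{3.6} \]

Summing (3.6) from 0 to $N$, we get

\[ \sum_{k=0}^{N} \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2 \le f(x^0) - f(x^{N+1}) \le f(x^0) - \inf_{x \in \mathbb{R}^n} f(x). \]

Since $\lambda_k \ge L_1 + 2\lambda$, taking the limit as $N \to \infty$, the above inequality becomes

\[ \sum_{k=0}^{\infty} \lambda \|d^k\|^2 \le \sum_{k=0}^{\infty} \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2 \le f(x^0) - \inf_{x \in \mathbb{R}^n} f(x) < \infty. \]

Thus

\[ \sum_{k=0}^{\infty} \|d^k\|^2 < \infty. \]

In addition,

\[ x^{k+1} - x^k = y^k + \eta_k d^k - x^k = (1 + \eta_k) d^k, \]

so, since $\eta_k \le \eta$,

\[ \sum_{k=0}^{\infty} \|x^{k+1} - x^k\|^2 = \sum_{k=0}^{\infty} (1 + \eta_k)^2 \|d^k\|^2 \le \sum_{k=0}^{\infty} (1 + \eta)^2 \|d^k\|^2 < \infty. \]

□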


Remark 3.1 The main difference between Algorithm 1 and the proximal point algorithm
proposed by N.T. An et al. lies in Step 2. In Algorithm 1, we use $d^k$ as the
descent direction at $y^k$. In addition, $x^{k+1} = y^k + \eta_k d^k = x^k + (1 + \eta_k) d^k$ and

\[ f(x^{k+1}) \le f(y^k) - \alpha \eta_k \|d^k\|^2 \le f(x^k) - \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2. \]

Hence, we take a longer step from $x^k$ and make the value of the function $f$ decrease more
than it does in the proximal point algorithm.
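The effect described in this remark can be seen in a single iteration. The sketch below compares the plain proximal point update (which would stop at $y^k$) with the boosted update $y^k + \eta_k d^k$ on a one-dimensional toy objective; the functions $\varphi(x) = \tfrac12\cos x$, $g(x) = x^2$, $h(x) = 2\sqrt{1 + x^2}$, the starting point and all parameter values are illustrative assumptions rather than data from the paper.

```python
import math


def f(x):
    # Toy objective phi + g - h with phi = 0.5*cos, g = x**2, h = 2*sqrt(1 + x**2).
    return 0.5 * math.cos(x) + x * x - 2.0 * math.sqrt(1.0 + x * x)


def proximal_point(x, lam=2.0):
    """Closed-form solution of the Step 1 subproblem for g(x) = x**2."""
    grad_h = 2.0 * x / math.sqrt(1.0 + x * x)
    grad_phi = -0.5 * math.sin(x)
    return (lam * x + grad_h - grad_phi) / (2.0 + lam)


x = 2.0
eta, alpha = 0.5, 0.1

y = proximal_point(x)   # where the plain proximal point algorithm would move
d = y - x               # direction used by the linesearch

m = 1                   # Armijo rule: smallest positive integer m giving enough decrease
while f(y + eta ** m * d) > f(y) - alpha * eta ** m * d * d:
    m += 1
eta_k = eta ** m
x_boosted = y + eta_k * d  # boosted update, equal to x + (1 + eta_k) * d
```

Here the boosted point lies further along the same direction than the proximal point and attains a strictly smaller objective value, in line with the inequality stated in the remark.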

The following theorem establishes the convergence and the convergence rate of
Algorithm 1.

Theorem 3.2 Under the assumptions of Theorem 3.1, assume further that $\nabla g$ is locally
Lipschitz continuous with constant $L_2$ and that $f$ satisfies the Lojasiewicz inequality with
exponent $\kappa \in [0, 1)$. If the sequence $\{x^k\}$ has a limit point $x^*$, then the whole sequence
$\{x^k\}$ converges to $x^*$, which is a stationary point of $f$. Moreover, denoting $f^* := f(x^*)$, the
following estimates hold:

(a) If $\kappa = 0$, then the sequences $\{x^k\}$ and $\{f(x^k)\}$ converge in a finite number of steps to
$x^*$ and $f^*$, respectively.

(b) If $\kappa \in (0, \frac{1}{2}]$, then the sequences $\{x^k\}$ and $\{f(x^k)\}$ converge linearly to $x^*$ and $f^*$,
respectively.

(c) If $\kappa \in (\frac{1}{2}, 1)$, then there exist some positive constants $A_1$ and $A_2$ such that for $k$ large
enough,

\[ \|x^k - x^*\| \le A_1 k^{-\frac{1 - \kappa}{2\kappa - 1}} \quad \text{and} \quad f(x^k) - f^* \le A_2 k^{-\frac{1}{2\kappa - 1}}. \]

Proof. By Theorem 3.1 (c), $\lim_{k \to \infty} f(x^k)$ exists; denote it by $f^*$. If $x^*$ is a limit point of
$\{x^k\}$, then there exists a subsequence $\{x^{k_i}\}$ of $\{x^k\}$ which converges to $x^*$. By continuity of
$f$, we have that

\[ f(x^*) = \lim_{i \to \infty} f(x^{k_i}) = \lim_{k \to \infty} f(x^k) = f^*. \]

Hence $f$ is finite and has the same value $f^*$ at every limit point of $\{x^k\}$. If $f(x^k) = f^*$
for some $k \ge 1$, then $f(x^k) = f(x^{k+r})$ for all $r \ge 0$, since the sequence $\{f(x^k)\}$ is decreasing.
Therefore, $x^k = x^{k+r}$ for all $r \ge 0$ and Algorithm 1 terminates after a finite number of
steps.

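Now we assume that $f(x^k) > f^*$ for all $k$. Since $f$ satisfies the Lojasiewicz property, there
exist $M > 0$, $\varepsilon_1 > 0$ and $\kappa \in [0, 1)$ such that

\[ |f(x) - f(x^*)|^{\kappa} \le M \|\nabla f(x)\|, \quad \forall x \in \mathbb{B}(x^*, \varepsilon_1). \tag{3.7} \]

Further, as $\nabla g$ is locally Lipschitz around $x^*$, there exist some constants $L_2 > 0$ and $\varepsilon_2 > 0$
such that

\[ \|\nabla g(x) - \nabla g(y)\| \le L_2 \|x - y\|, \quad \forall x, y \in \mathbb{B}(x^*, \varepsilon_2). \tag{3.8} \]

Let $\varepsilon := \frac{1}{2} \min\{\varepsilon_1, \varepsilon_2\} > 0$.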

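Since $\lim_{i \to \infty} x^{k_i} = x^*$ and $\lim_{i \to \infty} f(x^{k_i}) = f^*$, we can find an index $k_\varepsilon$ large enough such
that

\[ \|x^{k_\varepsilon} - x^*\| + \frac{2M(L_2 + \lambda_k)(1 + \eta)}{(1 - \kappa)(\lambda_k - L_1)} \big( f(x^{k_\varepsilon}) - f^* \big)^{1 - \kappa} < \varepsilon. \tag{3.9} \]

By Theorem 3.1 (c), we know that $d^k = y^k - x^k \to 0$ as $k \to \infty$. Then, without loss of
generality, we can assume that

\[ \|y^k - x^k\| < \varepsilon, \quad \forall k \ge k_\varepsilon. \]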


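We now claim that, for all $k \ge k_\varepsilon$, whenever $x^k \in \mathbb{B}(x^*, \varepsilon)$ the following holds:

\[ \|x^{k+1} - x^k\| \le \frac{M (L_2 + \lambda_k)(1 + \eta_k)}{(1 - \kappa) \big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \big)} \Big[ \big( f(x^k) - f^* \big)^{1 - \kappa} - \big( f(x^{k+1}) - f^* \big)^{1 - \kappa} \Big]. \tag{3.10} \]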

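Indeed, consider the concave function $\theta : (0, +\infty) \to (0, +\infty)$ defined by $\theta(t) = t^{1 - \kappa}$. Then
we have

\[ \theta(t_1) - \theta(t_2) \ge \theta'(t_1)(t_1 - t_2), \quad \forall t_1, t_2 > 0. \]

Substituting $t_1 = f(x^k) - f^*$ and $t_2 = f(x^{k+1}) - f^*$ in this inequality and using (3.7)
and then (3.6), we have

\[
\begin{aligned}
\big( f(x^k) - f^* \big)^{1 - \kappa} - \big( f(x^{k+1}) - f^* \big)^{1 - \kappa}
 &\ge \frac{1 - \kappa}{\big( f(x^k) - f^* \big)^{\kappa}} \big( f(x^k) - f(x^{k+1}) \big) \\
 &\ge \frac{1 - \kappa}{M \|\nabla f(x^k)\|} \Big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \Big) \|d^k\|^2 \\
 &= \frac{1 - \kappa}{M \|\nabla f(x^k)\|} \cdot \frac{\frac{\lambda_k - L_1}{2} + \alpha \eta_k}{(1 + \eta_k)^2} \|x^{k+1} - x^k\|^2.
\end{aligned}
\tag{3.11}
\]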
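On the other hand, by Algorithm 1, we get

\[ \nabla g(y^k) - \nabla h(x^k) + \nabla \varphi(x^k) + \lambda_k (y^k - x^k) = 0 \]

and

\[ \|y^k - x^*\| \le \|y^k - x^k\| + \|x^k - x^*\| < 2\varepsilon \le \varepsilon_2. \]

Using (3.8), we obtain

\[
\begin{aligned}
\|\nabla f(x^k)\| &= \|\nabla \varphi(x^k) + \nabla g(x^k) - \nabla h(x^k)\| \\
 &= \|\nabla g(x^k) - \nabla g(y^k) - \lambda_k (y^k - x^k)\| \\
 &\le \|\nabla g(x^k) - \nabla g(y^k)\| + \lambda_k \|y^k - x^k\| \\
 &\le (L_2 + \lambda_k) \|x^k - y^k\| \\
 &= \frac{L_2 + \lambda_k}{1 + \eta_k} \|x^{k+1} - x^k\|.
\end{aligned}
\tag{3.12}
\]

From (3.11) and (3.12), we get

\[ \big( f(x^k) - f^* \big)^{1 - \kappa} - \big( f(x^{k+1}) - f^* \big)^{1 - \kappa} \ge \frac{(1 - \kappa) \big( \frac{\lambda_k - L_1}{2} + \alpha \eta_k \big)}{M (L_2 + \lambda_k)(1 + \eta_k)} \|x^{k+1} - x^k\|, \]

which is (3.10).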

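From (3.10), since $\alpha \eta_k \ge 0$ and $\eta_k \le \eta$, we have

\[ \|x^{k+1} - x^k\| \le \frac{2M (L_2 + \lambda_k)(1 + \eta)}{(1 - \kappa)(\lambda_k - L_1)} \Big[ \big( f(x^k) - f^* \big)^{1 - \kappa} - \big( f(x^{k+1}) - f^* \big)^{1 - \kappa} \Big] \tag{3.13} \]

for all $k \ge k_\varepsilon$ such that $x^k \in \mathbb{B}(x^*, \varepsilon)$. We prove that $x^k \in \mathbb{B}(x^*, \varepsilon)$ for all $k \ge k_\varepsilon$ by induction.
Indeed, from (3.9) the claim holds for $k = k_\varepsilon$. Suppose that it also holds for
$k = k_\varepsilon, k_\varepsilon + 1, \dots, k_\varepsilon + r - 1$, with $r \ge 1$. Then (3.13) is valid for $k = k_\varepsilon, k_\varepsilon + 1, \dots, k_\varepsilon + r - 1$.
Therefore

\[
\begin{aligned}
\|x^{k_\varepsilon + r} - x^*\| &\le \|x^{k_\varepsilon} - x^*\| + \sum_{i=1}^{r} \|x^{k_\varepsilon + i} - x^{k_\varepsilon + i - 1}\| \\
 &\le \|x^{k_\varepsilon} - x^*\| + \frac{2M (L_2 + \lambda_k)(1 + \eta)}{(1 - \kappa)(\lambda_k - L_1)} \sum_{i=1}^{r} \Big[ \big( f(x^{k_\varepsilon + i - 1}) - f^* \big)^{1 - \kappa} - \big( f(x^{k_\varepsilon + i}) - f^* \big)^{1 - \kappa} \Big] \\
 &\le \|x^{k_\varepsilon} - x^*\| + \frac{2M (L_2 + \lambda_k)(1 + \eta)}{(1 - \kappa)(\lambda_k - L_1)} \big( f(x^{k_\varepsilon}) - f^* \big)^{1 - \kappa} < \varepsilon.
\end{aligned}
\]

Now adding (3.13) from $k = k_\varepsilon$ to $k_\varepsilon + r - 1$, one has

\[ \sum_{k = k_\varepsilon}^{k_\varepsilon + r - 1} \|x^{k+1} - x^k\| \le \frac{2M (L_2 + \lambda_k)(1 + \eta)}{(1 - \kappa)(\lambda_k - L_1)} \big( f(x^{k_\varepsilon}) - f^* \big)^{1 - \kappa}. \tag{3.14} \]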
Taking the limit as $r \to \infty$, we can conclude from (3.14) that

\[ \sum_{k=1}^{\infty} \|x^{k+1} - x^k\| < \infty, \tag{3.15} \]

which means that $\{x^k\}$ is a Cauchy sequence. Therefore, since $x^*$ is a limit point
of $\{x^k\}$, we conclude that the whole sequence converges to $x^*$. By Theorem 3.1 (d),
$x^*$ must be a stationary point of $f$.

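For all sufficiently large $k$, say $k \ge N$, it follows from (3.7), (3.8) and then (3.6) that

\[
\begin{aligned}
\big( f(x^k) - f^* \big)^{2\kappa} &\le M^2 \|\nabla f(x^k)\|^2 \\
 &= M^2 \|\nabla \varphi(x^k) + \nabla g(x^k) - \nabla h(x^k)\|^2 \\
 &= M^2 \|\nabla g(x^k) - \nabla g(y^k) - \lambda_k (y^k - x^k)\|^2 \\
 &\le M^2 (L_2 + \lambda_k)^2 \|x^k - y^k\|^2 \\
 &\le \frac{M^2 (L_2 + \lambda_k)^2}{\frac{\lambda_k - L_1}{2} + \alpha \eta_k} \big( f(x^k) - f(x^{k+1}) \big) \\
 &\le C \big[ (f(x^k) - f^*) - (f(x^{k+1}) - f^*) \big],
\end{aligned}
\tag{3.16}
\]

where $C > 0$ is a constant bounding the preceding coefficient from above; such a bound
exists because $L_1 + 2\lambda \le \lambda_k \le \bar{\lambda}$. By applying Lemma 2.1 with $t_k = f(x^k) - f^*$, $\mu = 2\kappa$ and $\nu = C$,
statements (a)-(c) regarding the sequence $\{f(x^k)\}$ follow from (3.16).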

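By (3.15), we know that $\mathcal{R}_i := \sum_{k=i}^{\infty} \|x^{k+1} - x^k\|$ is finite. Note that $\|x^i - x^*\| \le \mathcal{R}_i$ by
the triangle inequality. Therefore, the rate of convergence of $x^i$ to $x^*$ can be deduced from
the convergence rate of $\mathcal{R}_i$ to 0. Adding (3.13) from $i$ to $r$ with $i \le r$, we have

\[ \mathcal{R}_i = \lim_{r \to \infty} \sum_{k=i}^{r} \|x^{k+1} - x^k\| \le T_1 \big( f(x^i) - f^* \big)^{1 - \kappa}, \]

where $T_1 := \frac{2M (L_2 + \bar{\lambda})(1 + \eta)}{(1 - \kappa) \lambda} > 0$. Together with (3.7) and (3.12), we have

\[
\begin{aligned}
\mathcal{R}_i^{\frac{\kappa}{1 - \kappa}} &\le T_1^{\frac{\kappa}{1 - \kappa}} \big( f(x^i) - f^* \big)^{\kappa} \\
 &\le M T_1^{\frac{\kappa}{1 - \kappa}} \|\nabla f(x^i)\| \\
 &\le M T_1^{\frac{\kappa}{1 - \kappa}} \frac{L_2 + \lambda_i}{1 + \eta_i} \|x^{i+1} - x^i\| \\
 &\le M T_1^{\frac{\kappa}{1 - \kappa}} (L_2 + \lambda_i) \|x^{i+1} - x^i\| \\
 &\le M (L_2 + \bar{\lambda}) T_1^{\frac{\kappa}{1 - \kappa}} (\mathcal{R}_i - \mathcal{R}_{i+1}).
\end{aligned}
\]

Hence, by setting $T_2 := M (L_2 + \bar{\lambda}) T_1^{\frac{\kappa}{1 - \kappa}} > 0$, the above inequality becomes

\[ \mathcal{R}_i^{\frac{\kappa}{1 - \kappa}} \le T_2 (\mathcal{R}_i - \mathcal{R}_{i+1}). \]

Now letting $\mu = \frac{\kappa}{1 - \kappa}$ and $\nu = T_2$ and applying Lemma 2.1, we conclude that the statements in
(a)-(c) regarding the sequence $\{x^k\}$ hold. □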
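As a concrete reading of case (c) of Theorem 3.2, the short computation below works out the two exponents for the sample value $\kappa = 3/4$ (an illustrative choice, not a value singled out in the paper), using the two applications of Lemma 2.1 made in the proof.

```latex
% Worked example for kappa = 3/4 (illustrative value).
\begin{align*}
\text{Function values: } & \mu = 2\kappa = \tfrac{3}{2}
   \;\Rightarrow\; \tfrac{1}{\mu - 1} = \tfrac{1}{2\kappa - 1} = 2
   \;\Rightarrow\; f(x^k) - f^* = O(k^{-2}), \\
\text{Iterates: } & \mu = \tfrac{\kappa}{1 - \kappa} = 3
   \;\Rightarrow\; \tfrac{1}{\mu - 1} = \tfrac{1 - \kappa}{2\kappa - 1} = \tfrac{1}{2}
   \;\Rightarrow\; \|x^k - x^*\| = O(k^{-1/2}).
\end{align*}
```

Both exponents blow up as $\kappa \downarrow 1/2$, which is consistent with the switch to linear convergence in case (b).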
4 Convergence of the Inertial Proximal Algorithm


Now we consider the problem $P(\varphi, g, h)$, where $g$ and $h$ are not necessarily differentiable.
P.E. Maingé and A. Moudafi introduced the following inertial proximal algorithm for
solving $P(\varphi, g, h)$.


Algorithm 2 

Initialization. € M”; A,/i>0;a + /3>0;/3>0;7>0 and r > — 

Iteration k {k = 0, 1, 2, ■ ■ ■). Having x^, do the following steps: 

Step 1. Compute G dh{x^)] 'S/(p{x^) 

Step 2. = {I + \dg)~^{x^ — X{'Vip{x’^) — q^) — g{ax^ + fiy^)) 

Step 3. Compute = y^ — ^[ax^ + fiy^ + 7 • — x^)], 

where p = 1 + t/3 + and go to Iteration k with k is replaced by k Pl. 
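Step 2 of Algorithm 2 applies the resolvent $(I + \lambda \partial g)^{-1}$, that is, the proximal mapping of $\lambda g$. As a minimal sketch of what this operator looks like in a concrete case, the code below evaluates it for $g = \|\cdot\|_1$, where it reduces to componentwise soft-thresholding. The choice of $g$ and the function name `soft_threshold` are assumptions made for illustration; the paper does not fix a particular $g$.

```python
def soft_threshold(v, lam):
    """Resolvent (I + lam * d||.||_1)^{-1}(v): shrink each component toward zero by lam."""
    out = []
    for vi in v:
        if vi > lam:
            out.append(vi - lam)
        elif vi < -lam:
            out.append(vi + lam)
        else:
            out.append(0.0)  # components within [-lam, lam] are set to zero
    return out


shrunk = soft_threshold([3.0, -0.5, -2.0], 1.0)
```

In an implementation of Step 2 with this $g$, the argument `v` would be the point to which the resolvent is applied in that step, computed from the current iterates.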

Set a = 2+a+p ^ ~ 2+a+ii (5 > 0 , consider the discrete energy Ek{5) of 

Algorithm 2 defined by 

Ek{5) = 6f{x^) + ]^\\ax^+ hy^\\\ 


The following theorem was proved in 



Theorem 4.1 (Theorem 3.2 "Ml) Assume that (3 > 0, a + (3 > 0; t > —7 > | and 
g, h are proper lower semicontinuous convex functions on M"', p is a differentiable function 
on M"' with Li-Lipschitz continuous gradient, for some Li G (0,+oo) and X, fi verify 

XLi + + p) < 1. 


Then {x^} and {y^} generated by Algorithm 2 satisfy the following properties: 


(a) For S G [^(v^ - VaT+h)'^, -^{Vh + vWT^)^], where 02 = a[l + 26(7 - i)], 

62 = b[l + b{T — ^)], the energy {^^(<5)} is a decreasing and converging sequence. 

(b) lim f{x^) exists, 
k^oo 

(c) lim — x^ll = lim — y^|| = 0. 

/c—>-oo fc^oo 

(d) lim \\ax^ +/3y^\\ = 0. 

k^oo 

(e) If {x^} and {q^} are bounded, then every cluster point x* of the sequence {x^} is a 

critical point of the function f. 


When h is differentiable, Algorithm 2 becomes the following algorithm. 

Algorithm 3 

Initialization. x°, G M”'; A,p>0;a + /3>0 ;/3 >0;7>0 and r > — 
Iteration k {k = 0, 1, 2, ■ ■ ■). Having x^, y^ do the following steps: 

Step 1. Compute V(p(x^) and V/i(x^), 
Step 2. Solve the following strongly convex program

\[ \min_{x \in \mathbb{R}^n} \Big\{ g(x) - \langle \nabla h(x^k) - \nabla \varphi(x^k), x - x^k \rangle + \frac{\mu}{\lambda} \langle \alpha x^k + \beta y^k, x - x^k \rangle + \frac{1}{2\lambda} \|x - x^k\|^2 \Big\} \]

to get the unique solution $x^{k+1}$.

Step 3. = y^ — ^[ax^ + /3y^ + 'ya{x^^^ — x^)], 

where p = 1 + r/3 + and go to Iteration k with k is replaced by k + 1. 


Set (5i = (o 2 + b2)/3 and z = (x, y) E M"' x M”', define the function : M"' x M” ^ M as 
follows 

(/)(z) = 4>ix,y) = 6if{x) + -||ax + byf. 

The following theorem establishes the convergence of {x^}. 


Theorem 4.2 Under assumptions of Theorem f.l, we further assume that \/h is Lipschitz 
continuous with constant L 3 and (p{z) is K-L function, then 


(a) - i(2o2 + 62 )I|x‘’+‘ -x^f- |||/+' - y^f 

where z^ = {x^,y^), \/k. 

(b) //{x^} has a limit point x*, then the whole {x^} converges to x*. 

(c) If in addition, the function 9 in the K-L inequality has the form 9{t) = Mt^~^, then 

we have the following 

(i) if K = 0, then the algorithm terminate in a finite steps, 

(a) if 0 < K < ^, then there exist x4i > 0 and ( E (0,1) such that 

llx'^ -x*|| < x4lC^ 


(in) if h < K < 1, then there exists x42 > 0 such that 


x^ — x*|| < x42A;i-2«; . 


Proof, (a) From inequality (3.14) in [2^, we have 

Ek+,i6) - Ek{5) + - x^f + 62 II/+' - /f 


+(02 - 5^)(x^+i - x^ - /) < 0. 
A 


(4.17) 


Replacing 


^xA=+l_xA:,/+l_/) = _l||(x^+l-x")-(/+l-/)f + l||x'^+l-x'^f+ 1||/+1-/||2 
into (4.17), we get

Et+iiS) - Et{S) + ife + if )l|i‘+‘ - + fe + f - if )ll!/‘+‘ - 

+ 5 (if - n 2 )ll(a:‘+‘ -- (/+' - /)f < 0 

In particular case, when d = di = (02 + ^ 2)^5 (14.1811 becomes 
Ek+,{Si) - Ek{6i) + ^(202 + b2)\\x^+^ - x’^f + 

or 

i.(2'‘+') < - i(2a2 + fc)||i*+> - x^f - f |!,‘+‘ - /IP 


(b) By setting a = ^ min{ 2 a 2 + 62 ; ^ 2 } > 0. The inequality (I4.19P implies that 

- z^f. 

Therefore lim (j){z^) = f* does exist and lim — z^\\ = 0. 

fc^oo /c—>-oo 

Set co^ = 5i(j)^ — Vh{x^) + 'S/i^{x^)) + ax^ + by’^, where G dg{x^), and 
ujy = ax^ + by^. Then 

G d^(t>{z^). 

By the dehnition of Algorithm 3, we have that 

0 G dg{x^) + \{x’^- - Vh{x^-^) + V^(x^-^) + 

A A 

which implies that 

/ = - x^-i) + V/i(x^-i) - V<y9(x^-^) - ^(ax^-^ + 

A A 

Hence 

o;^ = (,5i(V/i(x^-^) - Vh{x’^) + Vp{x^) - Vp{x^-^)) - ^(x^ - x*^"^) 

A 

- (5i^(ax*^-i + ,5/-^) + ax*^ + hy^,ax^ + 6 /). 

A 

Therefore 

llw^ll < 5i(||V/i(x^-^) - V/i(x^)|| + llVf/jPx'^) - V</j(x'=-^)||) + ^||x^ - x^-^ 

A 

+ + /3/-i + 2||ax'= + by% 

A 


(4.18) 


(4.19) 


(4.20) 

set also 


(4.21) 
On the other hand, from the Lipschitz continuity of $\nabla \varphi$ and $\nabla h$ as well as Algorithm 3, we
have


\\V^ix^) - < LiWx’^ - x^-^\\, 

||V/i(x^-^) - V/i(x^)|| < LsWx^ - 

+ /3/-i|| = II - p(/ - /-i) - 7a(x^ - x^-^)|| 

<p||y'=-/"'ll+7a|k'=-x"-i||, 

and 

||ax^ + 6/11 = ||a(x^ - x^-^) + 6(/ - /-^) + ax^"^ + 6/-i|| 

< a||x^ - x^-^\\ + 611/ - /-^ll + ||ax^-^ + 6/-i|| 

^ + ^\\y" - + p\\y" - y '~'\\+ 

z + a + p 

= 2 + 1 +p ^ 7a)l|/ - /"i + (/3 + /II/ - /"^ll] • 

So (I4.2ip yields 


Set 


I kn ,-r T 1 ^ N dfa + ay), 

1“ ll<(«i(ri+r3 + X+A“T'>+2 + a + ^)' 

, ^ ^{P + p) N|| fc _ fc-l|| 

^ ^ A 2 + a + /3^"^ ^ " 


Ix'^ -x'^-^l 


^ f /T' r 1 Ai X 4(a + ay) 

C, - h(Li + L3 + J + jaj) + 2 + „ + ^ 

C. = ^,f: + iA±A, and 
A 2 -|- G! -|- p 


Then we deduce from the above inequality that 


|/|| < C||z^ -z' 


k-h 


(4.22) 


Therefore 


d(0; a^/z/) < C||z^ - z^-/. (4.23) 

Suppose that {x*^'} C {x^} and x^* ^ x* as z —)> oo, then from Theorem 14.11 (d), we get 

lim = y* such that ax* + /3y* = 0. Therefore {z^*} = {(x^%/*)} ^ (x*,y*) = z* as 
2—>-00 

i ^ oo. Since (p is K-L function, there exist C > 0, a neighborhood V of z* and a continuous 
concave function 6 : [0, C) —t [0,+oo) such that 0(0) = 0 and Vz G Viz*) satisfying 

/</z)</ + C 


we have then 


6 '{p{z) — (/)*)(i(0; (p{z)) > 1. 


(4.24) 
Let $\varepsilon > 0$ be such that $\mathbb{B}(z^*; \varepsilon) \subset V(z^*)$, where $\mathbb{B}(z^*; \varepsilon)$ is the ball centered at $z^*$ with radius $\varepsilon$.
Since $\lim_{i \to \infty} z^{k_i} = z^*$, $\lim_{k \to \infty} \|z^{k+1} - z^k\| = 0$, $\lim_{k \to \infty} \phi(z^k) = \phi^*$ and $\phi(z^k) > \phi^*$ for all $k$, we can find
a positive integer $k_\varepsilon$ such that


and 


z^^ eM{z*]e), (j)* <(p{z^^) <^*+ C, 


1 ^^^ - z*|| + ^^ - </>*) < 


(4.25) 

(4.26) 


where a = ^. 

a 

We observe that if fe > fe^, z^ E B( 2 ;*, e) and cff < 4>{z*) <(/>* + C, then 

+ u[0(</>(z")-,/.*)-0(0(/+i-,(>*)], 


\\r^k—l ^k\ 


(4.27) 


Indeed, since 6 is concave on [0; Q, we have 

9{t) — 9{s) > 9'{t){t — s), Vt, s E [0, (■)• 

Replacing t = 4>{z^) — (j)* and s = (f){z^~^^) — 4>* and combining with (|4.23p . (I4.24p and 
(I4.20p . we get from the above ineqnality that 

C||/-i - /||[0(<(>(/) - <P*) - 0((()(/+') - <P*)] > d{0;d^^^iz^))9’i<Piz’^) - ct^*mz^) - <()(/+')) 

> <()(/) - <()(z"+i) 


Hence, 


9{ct>{z^) - r) - 9{cl>{z^+^) - r) > 


> a\\z^ - 


a Hz'" - 
C \\z^~^ — z^W 


> - Hz'' - z" 
a 


|zfe-i _zfc| 


where the last inequality comes from that fact that 


\^k ^k+l\\2 lUfc—1 ^k\\ 

1^ ^ II I 11^ ^ II > \\yk _ ^fc+li 

II h — ^ ^ II A — K' 

Wz'^ i _ 4 


So we get (I4.27P . 

We next show that z^ E B(z*,e) for all fe > fee by indnction. Indeed, it deduces from 
p4.25p that z^" E B(z*,e). Suppose that z^% E B(z*,e) for some r > 1, 

we need verifying that z^'+^ E B(z*,e). We get from (|4.20l) and (|4.25l) that 

4>* < (j){z^) < 4>* + C, y k > k^. 

Using inequality (I4.27h for fe = fee, fee + 1, • • • , fee + r — 1, we have 


_ ^ke + l I 


Iz^e + l _ 


< - </>*) - 9icPiz^^+^) - </.*)] 


< 


4 

iz""' — z" 


+ a[9{cP{z^^+^) - r) - 9{cl>{z^‘+^) - r )] 


I fce+r-l _ ^ 


l^fce+r—2 _ ^fce+r—1| 


+ a[9{4>{z^^+^-^) - <P*) - 9{cj){z^^+n - </>*)]■ 


ke+r\ 
Hence


E 

2=1 


1 _ 1 1 

l^fee+i _ < L 11 ^^'“*“* — + — 11 ^^' — — ^^"+'^“^1 


2=1 


4„ „ 4 , 

+a[0(,/.(z"')-0*)-0(</)(/'+^)-,/.*)]. 


Therefore 


+ (4.28) 


2 = 1 


It is clear that 

r 

\\z’^^+^ - z*\\ < \\z^^ - z*\\ + Y^ 

2=1 

using (|4.28p and ()4.26p . it implies that 

4 r — Z^^~^ " 


l^fce+i _ ^fee+i—1 I 


|^fc.+r_ *11 < zrL-^ 

I " - 3^ 4 


+ o- 6 l((/.(z^'') - (p*)] + 11 /'^ - z*' 


< e. 


Thus 2;^'=+^ G B(2;*,e). So G B(2;*,e) for all k > k^. 

Because z^ G 8 ( 2 ;*,e) and (p* < (p{z^) < (p* + C,, ^k > /se, the inequality (|4.28l) holds 

00 

for all r. Consequently, the series ^ \\z^^^ — z^\\ is convergent, i.e., {z^} is a Cauchy 

k=l 

sequence, therefore lim z^ does exist, combining with lim 2 ;^' = 2 ;*, we get lim z^ = z*. 

k—>-oo i —^00 k—yoo 

Hence lim = x*. 

k^oo 

Now we prove the assertion (c). For each A; > 1, set TZk = ~ ^^W- 

Since lim z^ = z* = {x*,y*) with ax* + fiy* = 0, it implies that \\z^ — z*\\ < TZk- 
k^oo 

By the assumption (c), the inequality (I4.24p becomes 


M(1 - K)i(p{z) - (/>*)"'"d(0; d^(p{z)) > 1 (4.29) 

Combining with (I4.23h we get 

M(1 - k){(P{z^) - (P*)-^C\\z^ - /-i|| > 1. 

Hence, 

{(p{z^) - (p*f-‘^ < (M(l - k)C)^\\z^ - z^-^\\^. (4.30) 

For all A: > fee, it follows from (I4.28h that 

nk<\\\z^-z^-^\\ + ^{cP{z^)-cP*f-\ 

3 6a 

Combining with (|4.30p we get. 
\[ \mathcal{R}_k \le T_1 (\mathcal{R}_{k-1} - \mathcal{R}_k) + T_2 (\mathcal{R}_{k-1} - \mathcal{R}_k)^{\frac{1 - \kappa}{\kappa}}, \]

1 K 

where Ti = i,T 2 = ^(M(1 — k)C)~^ . Because ||x^ — x*|| < Wz’^ — z*\\, by the same 
argument with the proof of Theorem 2 in [^, we get the conclusion of assertion (c). □ 


5 Conclusions 

We have proposed an algorithm, called the boosted proximal point algorithm, for solving
nonconvex minimization problems of the form $P(\varphi, g, h)$: $\min\{f(x) = \varphi(x) + g(x) - h(x)\}$,
where $\varphi$ is differentiable and $g$, $h$ are convex functions. This algorithm combines the
proximal point algorithm with the descent direction algorithm of [15].
We then proved the global convergence and the convergence rate of this algorithm and of
the inertial proximal point algorithm proposed by P.E. Maingé and A. Moudafi for solving
$P(\varphi, g, h)$.